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Abstract. - We provide a scheme for exploring the reconstruction limits of compressed sensing 
by minimizing the general cost function under the random measurement constraints for generic 
correlated signal sources. Our scheme is based on the statistical mechanical replica method for 
dealing with random systems. As a simple but non-trivial example, we apply the scheme to a 
sparse autoregressive model, where the first differences in the input signals of the correlated time 
series are sparse, and evaluate the critical compression rate for a perfect reconstruction. The 
results are in good agreement with a numerical experiment for a signal reconstruction. 



Introduction. — Compressed sensing (CS) is a novel 
technique for data compression and has been drawing a lot 
qf attention recently from the viewpoints of both theory 
and application. The key idea behind CS is to utilize the 
sparsity of the original input signals as the prior knowledge 
during the signal reconstruction stage, which can signifi- 
cantly reduce the number of signal measurements required 
for a perfect reconstruction. This setup is realistic because 
"vye often have to face situations where we have to handle 
sparse signals in the real world. A lot of effort has been 
paid and significant progress has been made in investi- 
gating the properties of CS [I}^3]. After the pioneering 
works, contribution to CS problem from statistical me- 
chanics analysis is now growing rapidly PHlOj. 

The measurement process of CS is summarized in the 
following linear equation: 



(1) 



The vectors and matrices are denoted in bold in this ar- 
ticle. The input signal vector is A^-dimensional and 
the compressed signal vector y £ M.^ is P-dimensional. F 
is a P-by-A'^ compression matrix. In this article, we par- 
ticularly focus on random measurements, in which each 
F, F^i entry is independently and identically distributed 
(i.i.d.) from a Gaussian distribution of the zero mean 
and variance N~^. The compression rate is defined by 



a = P/N < 1. 

In earlier theoretical studies, the critical compression 
rate ac for perfectly reconstructing a;" from y has been 
actively assessed for various reconstruction schemes under 
the assumption that the input signal vector x^ is sparsely 
modeled by the distribution. 



(l-p)<5(^0)+pP(x^), 



(2) 



within the large system limit of N,P — > oo keeping 
a = P/N constant [IHl]. Here, P{x^) is a given probabilis- 
tic distribution and p denotes the density of the non-zero 
elements. In particular, the assessment for the reconstruc- 
tion scheme for minimizing the so-called £i-norm 



minimize 



El 



subject to y{= Fx°) = Fx, (3) 



which is termed the £i-norm reconstruction hereafter, has 
drawn a lot of attention because of its computational fea- 
sibility and robustness to measurement noise. In this re- 
gard, it may be surprising that a mathematically rigor- 
ous method of combinatorial geometry [2] and the replica 
method for statistical mechanics [3] provide an identical 
Uc value although the methodological equivalence between 
the two schemes has not really been clarified yet. In addi- 
tion, the value of ac seems rather universal pi llllIT^ : Uc 
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is unchanged as long as follows ^ and F"^F, where T 
denotes the matrix transpose, asymptotically obeys a ro- 
tationally invariant ensemble. However, the necessary and 
sufficient condition for the universality is still also open. 

The main purpose of this article is to offer a methodolog- 
ical basis for exploring this universality using the replica 
method. For this objective, we evaluate ac for general 
correlated distributions of a;", where P(a3°) is a joint dis- 
tribution with sparsity and not necessarily factorizable to 
each P(a;^), and the reconstruction schemes provided as 

minimize E{x) subject to y(= Fx'^) = Fx, (4) 

where E{x) is a generic cost function. For simplicity, we 
assume that each entry of F, F^i, is an i.i.d. Gaussian 
random number of the zero mean and variance N~^. How- 
ever, as shown later, situations in which x'^ is expanded 
by the i.i.d. coefficients sampled from ^ using a certain 
basis S can be cast to those of the correlated i^^i for i.i.d. 
signals sampled from ([2]). Namely, our analysis practically 
covers correlated compression matrices as well [9|. 

In addition to the theoretical interest, exploring the 
above setting is also significant for practical relevance. In 
most real world problems, the signals may be redundant 
in an information theoretic sense, but are not necessarily 
expressed as sparse upon first sight. In addition, in order 
to appropriately deal with such real world signals, certain 
cost functions other than the naive £i-norm of ([3]), such as 
the total variation (TV) |I3| . are widely used in practice 
for reconstructing signals. Our generic assumptions con- 
cerning the correlated distributions of the signal sources 
and cost functions for the signal reconstruction are in- 
tended to extend the analysis of the performance measure 
of compressed sensing, for more practically plausible 
scenarios beyond the simple cases of i.i.d. sparse sources 
and component-wise cost functions. 

Replica analysis: A general guideline. Here we 
sketch an outline of our analysis. This analysis is simi- 
lar to that of the recent study regarding CS for correlated 
compression matrices [3] and that of the correlated chan- 
nel in wireless telecommunication systems jl41ll5j . The 
technical details can be found in these references. 

Following the basic scenario in [J , let us define the key 
quantity for our analysis, which plays the role of free en- 
ergy in statistical mechanics and represents the typical 
value (per element) of the minimized cost (|4]) in the cur- 
rent context. 



C ^ - lim _[lnZ(/3,y)]f,.o 

= - hm lim lim ^ ln[Z"(/3, y)]^ , (5) 

where Z{f3,y) = J dx e^p{-PE{x))S{F{x - x°)) is the 
partition function and [•••]:>(■ generally denotes the average 
with respect to random variable X. Taking the limit f3 — > 
oo works for singling out the solution of Q in the partition 



function. Unfortunately, assessing y)]p'^o for Vti £ 

M in ([5]) is technically difficult. For resolving this difficulty, 
we evaluate analytical expressions of [Z"'{/3,y)]p ,j.o with 
respect to Vrt e N using the identity 



/7l 
n dx- exp{-l3E{x''))6{F{x'' - 33°)), 
a=l 



(6) 



which is valid only for n G N, and employ the obtained 
expressions for assessment of ([5]) assuming that they hold 
for Vn e M as well. This is often termed the replica 
method as integration variables a;" (a = I, 2, . . . , n) in (|6]) 
are regarded as n "replicas" of the original state variable 
x. For this, we analytically calculate the average of the 
right hand side of (jB]) employing the saddle-point method 
with respect to macroscopic variables qab = N~^(x°'Y'X^ 
and nria = N~^{x^)'^ x"- , which is justified as P, TV ^ I. 
The intrinsic invariance of ([6]) under any permutations of 
replica indices a = 1,2, ... ,n leads to the replica symmet- 
ric (RS) ansatz, which means that the dominant saddle 
point also possesses this property as qaa = Qab ~ q 
(a 7^ b) and ma = rn. This reproduces the mathemati- 
cally rigorous results for the basic model |4]. Therefore, 
we here also adopt this ansatz, validity of which will be 
checked later. The saddle point solution obtained under 
the RS ansatz seems to hold for n G K as well. Employing 
this in the right hand side of ^ yields an expression 



C 



= Extr 

q,m,x,Q,m,x 




(7) 



Here u} = (w;) = mcc" + ■\/xz, Extre{---} denotes 
the extremization of • • • with respect to Q, P{x'^) 
is the generic A^-dimensional distribution of the origi- 
nal signal a;°, and u ~ J dx°P{x'^)\x'^\^ denotes 
the second moment (per element) of the original sig- 
nal. Dz stands for the A'^-dimensional Gaussian measure 
(27r)"^/2n,=id2,cxp(-z2/2). The function (j)ih,Q) is 
defined by the minimization including the N variables as 



(t){h, Q)^j^ mm I ^x^x 



h^x 



E{x) 



(8) 



With regard to the final expressions ([7]) and ([H, three 
points are worthwhile to note. First, the right hand side 
of ([5]), in conjunction with substitution oi h = u) and Q 
as provided by (O, stands for the problem statistically 
equivalent to the original one This means that ran- 
dom constraints y = Fx of (|4|), in which multiple vari- 
ables are coupled with one another, can be handled as a 
bunch of decoupled extra random costs {Q/2)x1 — ujiXi 
(i = 1,2,..., N) in the performance assessment of large 
systems. Such correspondence is sometimes termed "de- 
coupling principle" in information theory literature |16) . 
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Second, the values of q and m determined by the extrem- 
ization condition of the right hand side of ([7]) represent the 
typical values of the averages of N^^x^x and N^^{x'^)'^x 
with respect to the uniform distribution of the solutions 
of (HI, respectively. If and only if the solutions typically 
accorded to x'^ allowing negligible errors per component 
in TV — > oo, the solution for q ~ m = u is thcrmody- 
namically dominant, implying that the reconstruction is 
typically successful. Therefore, one can characterize etc 
as a transition condition at which the successful solution 
q = m = u loses its thermodynamic dominance. When 
E{x) is convex downward, which is often the case in prac- 
tice, this can be examined by assessing the local stability 
of q = m = u since (HJ) is guaranteed to possess a unique 
solution. It might also be noteworthy that our criterion for 
a successful reconstruction is different from that of earlier 
mathematical studies [IH3] in which no errors were per- 
mitted. However, we expect that such differences are ir- 
relevant in the etc assessment as was the case for the basic 
problems of © and © g] . 

The final point is the computational cost for carrying 
out the above assessment. Although the average with re- 
spect to F has already been analytically taken into ac- 
count, those with respect to x^ and auxiliary random 
numbers z still remain in the expression In prac- 

tice, this should be assessed using a Monte Carlo sampling 
method for sufficiently large N and P, which in principle 
can offer arbitrarily accurate estimates of the averages in 
the large system limit iV, P — > oo (under the assumption 
that a certain thermodynamic limit exists). Therefore, 
the computational cost for performing the Monte Carlo 
sampling practically determines the feasibility. There are 
two possible sources for the computational difficulty. The 
first one is the computational cost for generating a;" fol- 
lowing iV-dimensional distribution P(x'^), which generally 
grows exponentially with respect to N. However, when 
x° can be expressed as a?" = Sx', where S and x' are a 
fixed matrix and a vector sampled from a computationally 
feasible distribution, respectively, generating x*^ is not a 
crucial problem for standard computational resources to 
date. This is also the case for z. The other difficulty could 
come out in numerically performing a minimization with 
respect to a; in However, when E{x) is convex, which 
we are assuming, the cost function on the right hand side 
of ^ is guaranteed to be convex as well. This indicates 
that one can also avoid a computational explosion using 
various schemes known for convex optimization |17lll8j in 
assessing ([5]). Furthermore, when the variable dependence 
of E{x) is pictorially expressed as a graph free from cycle, 
one may be able to use more efficient algorithms for the 
minimization [19j . These imply that although performing 
the developed method is generally computationally diffi- 
cult, it is still practically useful in the performance analysis 
for certain non-trivial classes of CS problems. In the next 
part, this is illustrated through application to time series 
data signals that are characterized by the sparsity con- 
cerning the difference between signals of successive times. 



Application: A sparse autoregressive model. — 

Model definition. For illustrating the utility of the 
developed scheme, we focus on the time series data signals 
generated from the use of the autoregression process of the 
first order with sparsity (sparse AR(1) model, denoted by 
SAR(l) in the following). A SAR(l) process is defined by 
the stochastic recurrence equation 

_ / rx'^ + Vl - r'^Vi with prob. p, , . 

^+1 \ x° with prob. l-p, ^ ' 

where < r, p < 1. We assume that random variable 
iji at each time i, including the first signal xl, is inde- 
pendently drawn from the normal Gaussian distribution 
A/'(0,1). Equivalently, this process is represented by the 
conditional probability of the signal at time i given a state 
at time i — 1, x'^_i, as 

P(x°|x^i) = il-p)Six^-xl,) 

The CS of this process has already been investigated from 
algorithmic point of view [5D]. Here we address the crit- 
ical compression rate ac of the signals from this process 
by the replica analysis. Although for simplicity reasons 
we focus on SAR(l) in the current article, extending the 
following argument to that of the fc-th order, SAR(fc), is 
straightforward . 

This model is considered as a special Gaussian mixture 
transition distribution model proposed by Le et al. for han- 
dling the non-Gaussian and nonlinear features of a time 
series in a unified framework [21] [22]. In ([9]) and ([TO)) . 
r represents a parameter of the autoregression satisfying 

< r < 1, while p (0 < p < 1) stands for a density pa- 
rameter with respect to the difference in signals between 
successive times. An example of the signals from SAR(l) 
is depicted in figure [1] For p = 1 this process is reduced to 
a normal autoregressive model of the first order, and for 
p < 1 the signal at time i pauses for the same state as the 
one in the previous time step i — 1 with a finite probability 

1 — p. Therefore, SAR(l) of p < 1 typically generates a 
time series that has a lot more pausing states than usual 
autoregressive models. This property may be suitable for 
modeling various kinds of time series data such as acoustic 
signals [53], the exploratory behavior of a house fly [53], 
the financial time series |25| . and more. 

In SAR(l), the signal differences are sparse but the sig- 
nals themselves are dense. This indicates that using the 
nai've ^i-norm as a cost function for the signal reconstruc- 
tion is not promising for improving the reconstruction per- 
formance. Instead, it may be reasonable to choose the cost 
function E{x) as £i-norm for the signal differences, namely 

minimize I Xi+i — I subject to y{= Fx^) = Fx, 

i 

(11) 
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Fig. 1; Example of data signal from SAR(l). The cases of p — 
0.5, r = (solid), p = 0.5, r = 0.5 (broken), and p — l,r = 
(dotted) are shown here. 



in terms of striking a balance between the statistical accor- 
dance to the original signals and computational feasibility. 

Defining a vector of the signal differences as x'^ = 
Xi — Xi-i (and x'l = xi) formally converts (jlip into an ex- 
pression of the naive ^i-norm reconstruction for x' = (x^) 
subject to the constraint y = F'x' offered by a modified 
compression matrix F' ~ FS, where S ~ {Sij ) is provided 
as Sij = 1 for i > j and vanishes, otherwise. Although F' 
is also a certain random matrix, the ensemble of {F')"^ F' 
is no more rotationally invariant as [{F')^ F']f = S"^ S 
holds true. Therefore, one cannot apply the results of ear- 
lier studies for the basic settings [IHl] to the analysis of 
SAR(l), as was pointed out in 

Saddle point equation and critical condition. Let us 
evaluate the critical reconstruction limit of SAR(l) by us- 
ing the scheme developed in the preceding part for pO[) 
and (|TT|) . Note that (|11D is described by a spin chain 
with random fields. Similar problems have been ana- 
lyzed in Extremization of ([7]), hi conjunction 
with the substitution of P{x°) = 11^=1 where 
P{xi\xq) = exp(— .T^/2)/\/27r, yields a set of saddle point 
equations, as 




where Dz = dzexp (— /\/27r and the A^-dimensional 
vector X* = (a;* (a;, Q j) is determined by 

—(l){uj, Q) = [Qx* - mx") - y^Zi 

+sgn(.T* - a;*+i) -|- sgn(a;* - = (13) 

(i = 1,2,..., N), which corresponds to the minimization 
condition of Q). sgn(a;) = x/\x\ for x ^0. 

For a sufficiently large a given r and p, the set of equa- 
tions ([T^ allows for the following solution: x ~^ +0, 
Q ~ fh ^ +00, Q = m ^ u and £ ~ This is 

because the third to fifth terms in ([T^]) are negligible com- 
pared to the first and second ones if \x* — x^l ~ 0(1) as 
Q ^ fh ^ +00 while x is kept at 0(1), and therefore, 
x*{u>, Q) x'j (j = 1, 2, . . . , N) holds in (HSl). This solu- 
tion represents nothing but a successful reconstruction. 

The critical reconstruction rate ac is determined by 
the local instability condition of this solution, which is 
summarized as the condition for preventing the behav- 
ior of X — >■ -1-0. In order to accurately evaluate this, 
we pay attention to the infinitesimal differences between 
X* and x^ by introducing the novel variables Xi{^\fxz) = 
\im^^+o{a/x){x* - xO) (z = 1, 2, . . . , N). Rewriting ^ 
using these variables within the limit of x ~^ +0 and ex- 
ploring the local stability condition of x — 5> +0 yield a set 
of equations for determining the reconstruction limit ac, 




- \f%Zi -I- sgn(e(xi - Xi+i) -t- a;° - a;°+i) 

-f sgn(e(S, - x^-x) +x°- x^^^) = 0, (14) 



where e > is a sufficiently small positive constant. We 
also checked the local stability against the disturbance 
that breaks the replica symmetry [28j . which gives the 
stability condition as 




(15) 

It may be noteworthy that this accords to that for the 
dynamical stability of the successful solution a;* = a;° con- 
cerning a belief propagation based algorithm for solving 
(fTT|) . A similar accordance has been observed in another 
system before [29] . 

Unfortunately, the off-diagonal contributions of 
{dx*{uj,Q)/duJk)^ {j 7^ k) always prevent the solution of 
([H]) from satisfying ([T5|). This implies the necessity of 
exploring the replica symmetry breaking (RSB) solutions 
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for accurately assessing Uc- However, we still speculate 
that the RS estimate at least offers a fairly good approxi- 
mation since the deviation from the results of numerical 
experiments shown later is considerably small. This 
speculation is also supported by the fact that the RS 
assessment provides the correct estimate of ac of the Iq 
recovery scheme for the basic model in spite that the RS 
solution is locally unstable for the RSB disturbance [J]. 

In the evaluation of the reconstruction limit ac, multi- 
ple integrals in the first and the second equations in (jl4p 
should be performed. This can be done in practice by 
using a Monte Carlo method. Particularly in the current 
case, this scheme works very efficiently because the sub- 
routine for determining Xi{^/xz), which is expressed as 
the third equation, can be carried out by using only the 
0{N) computational cost for a given pair of and z 
with making use of the belief propagation (equivalently, 
transfer matrix method or dynamic programming) |19j . 

Monte Carlo assessment of ac and experimental valida- 
tion. We evaluated the reconstruction limit ac by itera- 
tively solving p^ . For numerical stability, we solved the 
equation by converting the coordinates of the variables as 
x' = S^^x. We set the length of a;° to TV = 2 x 10^ and 
took 10'^ (figure 121) or 10^ (figure [3]) sample averages for 
the numerical evaluation of ac- Making N much larger is 
practically difficult due to the slow convergence of the iter- 
ation under the sample fluctuations. However, we judged 
that the signal length of iV = 2 x 10'^ was large enough for 
the evaluation of ac because the change in ac evaluated 
for = 1 X 10"^ was smaller than the value of the typical 
sample fluctuations. 

The reconstruction limit as a function of p and r is de- 
picted in figure [21 For a fixed r (top panel) , ac behaves 
as a convex upward function of p similarly to that for the 
case of the basic setting (dotted curve) [3] . When compar- 
ing this with the results from the basic setting of the i.i.d. 
sparse signals in [J], where ac = 0.8312 ... is evaluated for 
p = 0.5, the value of the reconstruction limit for r = 0, 
which corresponds to cases where there were no time cor- 
relations except for the pausing, is larger (bottom panel). 
This implies that the reconstruction limit does depend on 
the types of sparsity and that the sparsity of the signal dif- 
ferences is not as useful as that for the signals themselves 
in reducing the data size. With regard to the autoregres- 
sion parameter r, a decrease in a^ or the equivalent im- 
provement of the reconstruction performance is observed 
as r is increased (bottom panel) . This is plausible because 
the correlations generally decrease the information quan- 
tity of the signals, which in principle makes it possible to 
reduce the data size. 

To verify the obtained results, we also conducted nu- 
merical experiments. In the experiments, ac was numer- 
ically assessed as follows: In a trial, we first prepared an 
N X N random compression matrix F, and deleted the 
rows of the matrix one-by-one until the signal reconstruc- 
tion failed. A failure was judged when \x^ — x'^\ > 10^"*, 
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Fig. 2: Reconstruction limit Oc for signal from SAR(l) as func- 
tion of p with r = 0.5 (top) or r with p — 0.5 (bottom). In the 
top figure, the dependence on p (cross) is almost the same as 
the basic setting examined in [4] (dotted curve). 



where «U j|( IS the reconstructed vector, was first satisfied, 
and the value Pc = P + 1, where P is the number of rows 
when the reconstruction failure, was recorded. We used 
the convex optimization package for MATLAB developed 
in [TTKIH] to search for a;». For each A^, this trial was re- 
peated 10^ times, and the typical reconstruction limit for 
a finite A^, ac{N), was assessed as q;c(A^) = Pc/N, where 
denotes the arithmetic average over the 10^ trials. Fi- 
nally, the critical value of A — > oo was evaluated by using 
the quadratic fitting with respect to A^~^ to ac{N). 

The results are summarized in figure[31 where the depen- 
dence of ac{N) on the signal length A^ is depicted for r = 
and r = 0.5 with p = 0.5. A decrease of the reconstruction 
limit ac (or improvement of reconstruction performance) 
for a larger r is observed as expected from the replica 
analysis. In order to compare this with the reconstruction 
limit from the replica analysis, we also performed a scaling 
analysis using a quadratic function regression and extrap- 
olated the result to A^ oo. which gives ac — 0.8485(3) 
for r = and ac ~ 0.8406(3) for r = 0.5. The recon- 
struction limits for A^ — > oo from the extrapolation are 
reasonably close to the values from the replica analysis 
{ac = 0.8491(2) for r = and ac = 0.8412(1) for r = 0.5 
respectively), considering possible biases which come out 
due to infiuences of higher order terms of A^"^ in the data 
fitting, which validates our analysis based on the statisti- 
cal mechanical scheme. 

Summary and discussion. In summary, we have 
developed a scheme to assess the typical reconstruction 
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Fig. 3: Reconstruction limit «c for SAR(l) of finite dimension 
A'' from reconstruction experiment. Tfie results for r = (+) 
and r = 0.5 (x) when p = 0.5 are shown. The curves indicate 
the result of the quadratic function regression for the depen- 
dence of (inverse of) input signal dimension N (solid for r — 
and broken for r = 0.5). The two horizontal lines indicate 
the results of a replica analysis for r = (solid) and r = 0.5 
(broken) , respectively. 



limit of compressed sensing problems that arc defined by 
the generic signal sources and cost functions under the as- 
sumption of random measurements. Although the scheme 
is computationally difficult in general, it is still of prac- 
tical utility when the source distribution is computation- 
ally feasible and the cost function is convex downward. As 
an example for showing the utility, we have taken up the 
problem of sparse autoregression and have examined how 
ac depends on two system parameters that specify the au- 
toregression process. Our investigation has indicated that 
the sparsity of the signal differences between successive 
times is not as useful as that of the signals themselves for 
compressing the data size. 

In earlier studies [51[TTJ[T2], the universality of ac has 
been observed for i.i.d. sparse sources as long as the cross 
correlation matrix (F)"^ F of the random compression ma- 
trix F asymptotically obeys a rotationally invariant en- 
semble. The problem of the sparse autoregression of van- 
ishing correlation parameter (r = 0) can be cast to the 
cases of the i.i.d. sources in which the {F)'^F ensemble is 
not asymptotically rotationally invariant. Our result in- 
dicates that applying the theoretical results obtained for 
random compression matrices and i.i.d. sources to realis- 
tic problems requires a certain care because either/both 
F or/and the original signals can contain non-negligible 
correlations in most real world problems. 

Exploring a more realistic time series modeled by 
SAR(fc) (fc > 2), two dimensional signals (images) is in- 
cluded in our future plan. Besides, compressed sensing 
with noise is also significant for application. Its perfor- 
mance can be analyzed by the generalization of our for- 
malism, which is also a promising future work. 
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